# FP8 Dynamic Quantization

## Qwen3 30B A3B FP8 Dynamic
An FP8 dynamic quantization version of the Qwen/Qwen3-30B-A3B model, optimized for inference efficiency on Ampere-architecture GPUs.
Tags: Large Language Model · Transformers
khajaphysist · 403 · 2
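Checkpoints quantized this way are typically served directly with vLLM, which picks up the FP8 quantization config stored in the repository. The snippet below is a minimal sketch only; the repository id is an assumption inferred from the listing above, not a verified path.

```python
# Minimal sketch: serving an FP8-dynamic checkpoint with vLLM.
# The repo id below is assumed from the listing above, not verified.
from vllm import LLM, SamplingParams

llm = LLM(model="khajaphysist/Qwen3-30B-A3B-FP8-Dynamic")  # hypothetical repo id
params = SamplingParams(temperature=0.7, max_tokens=64)

outputs = llm.generate(["Summarize FP8 dynamic quantization in one sentence."], params)
print(outputs[0].outputs[0].text)
```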
## Llama Joycaption Alpha Two Hf Llava FP8 Dynamic
License: MIT
An FP8-compressed version of the Llama JoyCaption Alpha Two model developed by fancyfeast, implemented with the llm-compressor tool and compatible with the vLLM framework.
Tags: Image-to-Text · English
JKCHSTR · 248 · 1
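The entry above names llm-compressor as the compression tool. The sketch below shows how an FP8 dynamic checkpoint (static FP8 weights, per-token dynamic activation scales) is commonly produced with it for a generic causal language model; the model id and output path are placeholders, and the exact import path may vary across llm-compressor versions.

```python
# Minimal sketch: FP8 dynamic quantization with llm-compressor.
# Model id and output path are placeholders, not taken from the listing.
from transformers import AutoModelForCausalLM, AutoTokenizer
from llmcompressor import oneshot  # older releases: from llmcompressor.transformers import oneshot
from llmcompressor.modifiers.quantization import QuantizationModifier

MODEL_ID = "path/to/source-model"          # placeholder
OUTPUT_DIR = "source-model-FP8-Dynamic"    # placeholder

model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

# FP8_DYNAMIC: channel-wise FP8 weights plus dynamic per-token activation scales;
# this scheme needs no calibration dataset.
recipe = QuantizationModifier(targets="Linear", scheme="FP8_DYNAMIC", ignore=["lm_head"])

oneshot(model=model, recipe=recipe)

model.save_pretrained(OUTPUT_DIR)
tokenizer.save_pretrained(OUTPUT_DIR)
```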
## Magnum V4 72b FP8 Dynamic
License: Apache-2.0
A 72B-parameter large language model fine-tuned from Qwen2.5-72B-Instruct. It uses dynamic FP8 quantization to improve inference efficiency and aims to reproduce the prose quality of Claude 3.
Tags: Large Language Model · Transformers · English
Infermatic · 2,106 · 2